Overview

Dataset statistics

Number of variables: 14
Number of observations: 24767
Missing cells: 24767
Missing cells (%): 7.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.6 MiB
Average record size in memory: 112.0 B
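
The derived figures in this block can be cross-checked from the raw counts. The following is a minimal sketch, assuming that the missing-cell percentage is missing cells divided by (variables × observations) and that total memory is observations × average record size; the variable names are illustrative and not part of the report.

```python
# Raw counts copied from the dataset statistics above.
n_variables = 14
n_observations = 24767
missing_cells = 24767
avg_record_bytes = 112.0

# Assumed definitions (not stated in the report itself).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
total_mib = n_observations * avg_record_bytes / 2**20

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Total size in memory: {total_mib:.1f} MiB")
```

Under these assumptions both values agree with the report: one fully missing column out of 14 gives 1/14 of all cells.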

Variable types

Numeric: 9
Categorical: 4
Unsupported: 1

Warnings

is_failed has constant value "1" [Constant]
customer_id has a high cardinality: 15892 distinct values [High cardinality]
order_date has a high cardinality: 734 distinct values [High cardinality]
is_failed is highly correlated with payment_id [High correlation]
payment_id is highly correlated with is_failed [High correlation]
customer_order_rank has 24767 (100.0%) missing values [Missing]
voucher_amount is highly skewed (γ1 = 44.89247979) [Skewed]
amount_paid is highly skewed (γ1 = 44.22375169) [Skewed]
platform_id is highly skewed (γ1 = -25.89811096) [Skewed]
df_index has unique values [Unique]
customer_order_rank is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
order_hour has 392 (1.6%) zeros [Zeros]
voucher_amount has 23086 (93.2%) zeros [Zeros]
delivery_fee has 18507 (74.7%) zeros [Zeros]
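
The Skewed warnings report a skewness coefficient γ1. As a rough illustration, the sketch below computes the moment-based skewness m3 / m2^(3/2). This is an assumption about the general idea only: the report may use a bias-adjusted estimator, which differs slightly from this form, and the function name `skewness` is hypothetical.

```python
def skewness(values):
    """Moment-based skewness: third central moment over variance^(3/2).

    Assumes the values are not all identical (variance > 0).
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# A zero-heavy sample with one large value, loosely mimicking voucher_amount.
sample = [0.0] * 9 + [10.0]
print(skewness(sample))
```

A column that is mostly zeros with a few large values, as voucher_amount is here, produces a large positive coefficient for this reason.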

Reproduction

Analysis started: 2021-02-25 23:04:09.905636
Analysis finished: 2021-02-25 23:04:21.577354
Duration: 11.67 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 24767
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 396083.3167
Minimum: 160
Maximum: 786595
Zeros: 0
Zeros (%): 0.0%
Memory size: 193.6 KiB

Quantile statistics

Minimum: 160
5th percentile: 39104.3
Q1: 201279.5
Median: 395098
Q3: 593741
95th percentile: 752565.6
Maximum: 786595
Range: 786435
Interquartile range (IQR): 392461.5
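
Range and IQR follow directly from the other quantile statistics (maximum minus minimum, and Q3 minus Q1). A minimal sketch using only the standard library; the helper name `quantile_summary` is hypothetical, and the "inclusive" interpolation method is an assumption about how the report computes quartiles.

```python
import statistics


def quantile_summary(values):
    """Return quartiles plus the derived range and IQR."""
    # method="inclusive" interpolates linearly between data points.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "Q1": q1,
        "median": median,
        "Q3": q3,
        "max": max(values),
        "range": max(values) - min(values),
        "IQR": q3 - q1,
    }


summary = quantile_summary([1, 2, 3, 4, 5])

# Cross-check of the df_index figures reported above.
reported_range = 786595 - 160
reported_iqr = 593741 - 201279.5
```

The two cross-check values reproduce the reported Range (786435) and IQR (392461.5).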

Descriptive statistics

Standard deviation: 227928.2356
Coefficient of variation (CV): 0.5754552793
Kurtosis: -1.194618779
Mean: 396083.3167
Median Absolute Deviation (MAD): 195844
Skewness: 0.001920806378
Sum: 9809795504
Variance: 5.19512806 × 10^10
Monotonicity: Strictly increasing
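
Two of the descriptive statistics are simple functions of the others. The sketch below shows the coefficient of variation (standard deviation divided by mean) and the median absolute deviation (median of absolute deviations from the median). The function names are hypothetical, and using the sample standard deviation (n - 1 denominator) is an assumption.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


def median_absolute_deviation(values):
    """Median of the absolute deviations from the median."""
    med = statistics.median(values)
    return statistics.median(abs(v - med) for v in values)


data = [2, 4, 4, 4, 5, 5, 7, 9]
cv = coefficient_of_variation(data)
mad = median_absolute_deviation(data)

# Cross-check of the df_index CV from its reported std and mean.
reported_cv = 227928.2356 / 396083.3167
```

The cross-check reproduces the reported CV of about 0.5755.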
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
131072 | 1 | < 0.1%
171652 | 1 | < 0.1%
130325 | 1 | < 0.1%
386086 | 1 | < 0.1%
118035 | 1 | < 0.1%
394908 | 1 | < 0.1%
763373 | 1 | < 0.1%
120080 | 1 | < 0.1%
339215 | 1 | < 0.1%
42254 | 1 | < 0.1%
Other values (24757) | 24757 | > 99.9%

Value | Count | Frequency (%)
160 | 1 | < 0.1%
226 | 1 | < 0.1%
230 | 1 | < 0.1%
233 | 1 | < 0.1%
246 | 1 | < 0.1%

Value | Count | Frequency (%)
786595 | 1 | < 0.1%
786593 | 1 | < 0.1%
786423 | 1 | < 0.1%
786372 | 1 | < 0.1%
786365 | 1 | < 0.1%

customer_id
Categorical

HIGH CARDINALITY

Distinct: 15892
Distinct (%): 64.2%
Missing: 0
Missing (%): 0.0%
Memory size: 193.6 KiB

Value | Count
d3c48a29c3bb | 91
41708e47a759 | 73
068a8ca306aa | 59
9a548f7fef25 | 55
52ecbd9f90cf | 53
Other values (15887) | 24436

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 297204
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 12067
Unique (%): 48.7%
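
The report lists both Distinct (15892) and Unique (12067) for this column. A plausible reading, assumed here rather than stated in the report, is that Distinct counts different values while Unique counts values that occur exactly once. The sketch below illustrates that reading on the five sample rows shown for customer_id; `distinct_and_unique` is a hypothetical helper.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# The five sample rows listed for customer_id in this report.
ids = [
    "000afe75dc19",
    "0011683ab734",
    "0011f9b2693b",
    "0011f9b2693b",
    "0012b1474694",
]
distinct, unique = distinct_and_unique(ids)
print(distinct, unique)
```

Here the repeated ID counts once towards Distinct and not at all towards Unique.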

Sample

1st row: 000afe75dc19
2nd row: 0011683ab734
3rd row: 0011f9b2693b
4th row: 0011f9b2693b
5th row: 0012b1474694
Value | Count | Frequency (%)
d3c48a29c3bb | 91 | 0.4%
41708e47a759 | 73 | 0.3%
068a8ca306aa | 59 | 0.2%
9a548f7fef25 | 55 | 0.2%
52ecbd9f90cf | 53 | 0.2%
c43419609296 | 43 | 0.2%
894f8ca0d994 | 39 | 0.2%
f8b74c17e2f0 | 37 | 0.1%
da2c86a726ad | 36 | 0.1%
50df4d1dbeba | 34 | 0.1%
Other values (15882) | 24247 | 97.9%
Histogram of lengths of the category

Most occurring characters

ValueCountFrequency (%)
419161
 
6.4%
f19037
 
6.4%
a18744
 
6.3%
e18665
 
6.3%
218657
 
6.3%
918570
 
6.2%
818557
 
6.2%
018550
 
6.2%
618546
 
6.2%
118541
 
6.2%
Other values (6)110176
37.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number185684
62.5%
Lowercase Letter111520
37.5%

Most frequent character per category

ValueCountFrequency (%)
419161
10.3%
218657
10.0%
918570
10.0%
818557
10.0%
018550
10.0%
618546
10.0%
118541
10.0%
718528
10.0%
518343
9.9%
318231
9.8%
ValueCountFrequency (%)
f19037
17.1%
a18744
16.8%
e18665
16.7%
c18417
16.5%
b18382
16.5%
d18275
16.4%

Most occurring scripts

ValueCountFrequency (%)
Common185684
62.5%
Latin111520
37.5%

Most frequent character per script

ValueCountFrequency (%)
419161
10.3%
218657
10.0%
918570
10.0%
818557
10.0%
018550
10.0%
618546
10.0%
118541
10.0%
718528
10.0%
518343
9.9%
318231
9.8%
ValueCountFrequency (%)
f19037
17.1%
a18744
16.8%
e18665
16.7%
c18417
16.5%
b18382
16.5%
d18275
16.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII297204
100.0%

Most frequent character per block

ValueCountFrequency (%)
419161
 
6.4%
f19037
 
6.4%
a18744
 
6.3%
e18665
 
6.3%
218657
 
6.3%
918570
 
6.2%
818557
 
6.2%
018550
 
6.2%
618546
 
6.2%
118541
 
6.2%
Other values (6)110176
37.1%

order_date
Categorical

HIGH CARDINALITY

Distinct: 734
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 193.6 KiB

Value | Count
2016-11-24 | 160
2016-10-03 | 146
2015-04-28 | 141
2017-01-01 | 127
2015-03-27 | 109
Other values (729) | 24084

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 247670
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: 2015-06-03
2nd row: 2017-02-05
3rd row: 2016-09-01
4th row: 2016-09-05
5th row: 2017-02-12
Value | Count | Frequency (%)
2016-11-24 | 160 | 0.6%
2016-10-03 | 146 | 0.6%
2015-04-28 | 141 | 0.6%
2017-01-01 | 127 | 0.5%
2015-03-27 | 109 | 0.4%
2016-08-28 | 103 | 0.4%
2016-12-24 | 102 | 0.4%
2015-08-04 | 99 | 0.4%
2016-11-13 | 99 | 0.4%
2016-09-04 | 96 | 0.4%
Other values (724) | 23585 | 95.2%
Histogram of lengths of the category

Most occurring characters

ValueCountFrequency (%)
054442
22.0%
-49534
20.0%
147333
19.1%
240298
16.3%
619761
 
8.0%
510729
 
4.3%
77353
 
3.0%
35131
 
2.1%
84605
 
1.9%
44273
 
1.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number198136
80.0%
Dash Punctuation49534
 
20.0%

Most frequent character per category

ValueCountFrequency (%)
054442
27.5%
147333
23.9%
240298
20.3%
619761
 
10.0%
510729
 
5.4%
77353
 
3.7%
35131
 
2.6%
84605
 
2.3%
44273
 
2.2%
94211
 
2.1%
ValueCountFrequency (%)
-49534
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common247670
100.0%

Most frequent character per script

ValueCountFrequency (%)
054442
22.0%
-49534
20.0%
147333
19.1%
240298
16.3%
619761
 
8.0%
510729
 
4.3%
77353
 
3.0%
35131
 
2.1%
84605
 
1.9%
44273
 
1.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII247670
100.0%

Most frequent character per block

ValueCountFrequency (%)
054442
22.0%
-49534
20.0%
147333
19.1%
240298
16.3%
619761
 
8.0%
510729
 
4.3%
77353
 
3.0%
35131
 
2.1%
84605
 
1.9%
44273
 
1.7%

order_hour
Real number (ℝ≥0)

ZEROS

Distinct: 24
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.2271571
Minimum: 0
Maximum: 23
Zeros: 392
Zeros (%): 1.6%
Memory size: 193.6 KiB

Quantile statistics

Minimum: 0
5th percentile: 10
Q1: 15
Median: 18
Q3: 20
95th percentile: 22
Maximum: 23
Range: 23
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 4.455628025
Coefficient of variation (CV): 0.2586397743
Kurtosis: 3.986340157
Mean: 17.2271571
Median Absolute Deviation (MAD): 2
Skewness: -1.746123381
Sum: 426665
Variance: 19.8526211
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)
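
A histogram with fixed-size bins splits the interval from minimum to maximum into equally wide bins and counts the values in each. The sketch below is one simple way to do this; `fixed_width_histogram` is a hypothetical helper, and placing the maximum in the last bin is a common convention assumed here, not taken from the report.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equally wide intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if hi == lo:
        # Degenerate case: every value is identical, so use the first bin.
        counts[0] = len(values)
        return counts
    width = (hi - lo) / bins
    for v in values:
        # The maximum would fall just past the last edge; clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


hours = [0, 5, 12, 18, 19, 23]
counts = fixed_width_histogram(hours, bins=24)
```

With 24 bins over the range 0 to 23, each bin is slightly narrower than one hour, so bin edges do not fall exactly on integer hours.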
Value | Count | Frequency (%)
19 | 3456 | 14.0%
20 | 3195 | 12.9%
18 | 3037 | 12.3%
21 | 2419 | 9.8%
17 | 2200 | 8.9%
22 | 1768 | 7.1%
16 | 1433 | 5.8%
14 | 1168 | 4.7%
15 | 1151 | 4.6%
13 | 1051 | 4.2%
Other values (14) | 3889 | 15.7%

Value | Count | Frequency (%)
0 | 392 | 1.6%
1 | 236 | 1.0%
2 | 113 | 0.5%
3 | 71 | 0.3%
4 | 24 | 0.1%

Value | Count | Frequency (%)
23 | 900 | 3.6%
22 | 1768 | 7.1%
21 | 2419 | 9.8%
20 | 3195 | 12.9%
19 | 3456 | 14.0%

customer_order_rank
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 24767
Missing (%): 100.0%
Memory size: 193.6 KiB

is_failed
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 193.6 KiB

Value | Count
1 | 24767

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 24767
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1
Value | Count | Frequency (%)
1 | 24767 | 100.0%
Histogram of lengths of the category

Most occurring characters

ValueCountFrequency (%)
124767
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number24767
100.0%

Most frequent character per category

ValueCountFrequency (%)
124767
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common24767
100.0%

Most frequent character per script

ValueCountFrequency (%)
124767
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII24767
100.0%

Most frequent character per block

ValueCountFrequency (%)
124767
100.0%

voucher_amount
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 193
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1482235854
Minimum: 0
Maximum: 93.3989
Zeros: 23086
Zeros (%): 93.2%
Memory size: 193.6 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 1.029
Maximum: 93.3989
Range: 93.3989
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.090932377
Coefficient of variation (CV): 7.360045796
Kurtosis: 3291.572191
Mean: 0.1482235854
Median Absolute Deviation (MAD): 0
Skewness: 44.89247979
Sum: 3671.05354
Variance: 1.19013345
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 23086 | 93.2%
1.029 | 476 | 1.9%
2.058 | 357 | 1.4%
1.715 | 286 | 1.2%
0.686 | 134 | 0.5%
1.372 | 58 | 0.2%
2.744 | 51 | 0.2%
2.5725 | 43 | 0.2%
5.145 | 42 | 0.2%
3.43 | 10 | < 0.1%
Other values (183) | 224 | 0.9%

Value | Count | Frequency (%)
0 | 23086 | 93.2%
0.00343 | 2 | < 0.1%
0.41846 | 1 | < 0.1%
0.5145 | 7 | < 0.1%
0.61054 | 1 | < 0.1%

Value | Count | Frequency (%)
93.3989 | 1 | < 0.1%
78.02907 | 1 | < 0.1%
33.39791 | 1 | < 0.1%
27.80358 | 1 | < 0.1%
24.15406 | 1 | < 0.1%

delivery_fee
Real number (ℝ≥0)

ZEROS

Distinct: 69
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2114688735
Minimum: 0
Maximum: 9.86
Zeros: 18507
Zeros (%): 74.7%
Memory size: 193.6 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0.2465
95th percentile: 0.986
Maximum: 9.86
Range: 9.86
Interquartile range (IQR): 0.2465

Descriptive statistics

Standard deviation: 0.4319605626
Coefficient of variation (CV): 2.042667346
Kurtosis: 20.30196018
Mean: 0.2114688735
Median Absolute Deviation (MAD): 0
Skewness: 2.961695917
Sum: 5237.44959
Variance: 0.1865899276
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 18507 | 74.7%
0.493 | 2036 | 8.2%
0.986 | 1197 | 4.8%
0.7395 | 1077 | 4.3%
1.4297 | 391 | 1.6%
1.479 | 312 | 1.3%
1.2325 | 252 | 1.0%
0.2465 | 206 | 0.8%
0.46835 | 108 | 0.4%
1.7255 | 71 | 0.3%
Other values (59) | 610 | 2.5%

Value | Count | Frequency (%)
0 | 18507 | 74.7%
0.0493 | 1 | < 0.1%
0.0986 | 3 | < 0.1%
0.1479 | 7 | < 0.1%
0.22185 | 3 | < 0.1%

Value | Count | Frequency (%)
9.86 | 1 | < 0.1%
6.6555 | 1 | < 0.1%
5.916 | 1 | < 0.1%
4.93 | 3 | < 0.1%
3.944 | 3 | < 0.1%

amount_paid
Real number (ℝ≥0)

SKEWED

Distinct: 2533
Distinct (%): 10.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.55155809
Minimum: 0
Maximum: 1131.03
Zeros: 46
Zeros (%): 0.2%
Memory size: 193.6 KiB

Quantile statistics

Minimum: 0
5th percentile: 4.248
Q1: 6.372
Median: 8.7615
Q3: 12.056355
95th percentile: 21.2931
Maximum: 1131.03
Range: 1131.03
Interquartile range (IQR): 5.684355

Descriptive statistics

Standard deviation: 11.79499819
Coefficient of variation (CV): 1.117844216
Kurtosis: 3610.45244
Mean: 10.55155809
Median Absolute Deviation (MAD): 2.655
Skewness: 44.22375169
Sum: 261330.4393
Variance: 139.1219822
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
5.31 | 576 | 2.3%
7.965 | 519 | 2.1%
6.372 | 409 | 1.7%
5.841 | 362 | 1.5%
8.496 | 327 | 1.3%
10.62 | 321 | 1.3%
6.903 | 308 | 1.2%
5.5755 | 294 | 1.2%
9.027 | 284 | 1.1%
6.6375 | 277 | 1.1%
Other values (2523) | 21090 | 85.2%

Value | Count | Frequency (%)
0 | 46 | 0.2%
0.01593 | 1 | < 0.1%
0.0531 | 1 | < 0.1%
0.17523 | 1 | < 0.1%
0.23895 | 1 | < 0.1%

Value | Count | Frequency (%)
1131.03 | 1 | < 0.1%
581.7105 | 1 | < 0.1%
363.01815 | 1 | < 0.1%
353.3805 | 1 | < 0.1%
246.88845 | 1 | < 0.1%

restaurant_id
Real number (ℝ≥0)

Distinct: 7946
Distinct (%): 32.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 174489029.2
Minimum: 73498
Maximum: 339823498
Zeros: 0
Zeros (%): 0.0%
Memory size: 193.6 KiB

Quantile statistics

Minimum: 73498
5th percentile: 31483498
Q1: 96423498
Median: 184773498
Q3: 243643498
95th percentile: 304893498
Maximum: 339823498
Range: 339750000
Interquartile range (IQR): 147220000

Descriptive statistics

Standard deviation: 88454059.81
Coefficient of variation (CV): 0.5069319271
Kurtosis: -1.086619904
Mean: 174489029.2
Median Absolute Deviation (MAD): 72620000
Skewness: -0.1783513105
Sum: 4.321569785 × 10^12
Variance: 7.824120697 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
48073498 | 74 | 0.3%
154543498 | 50 | 0.2%
88773498 | 39 | 0.2%
159773498 | 36 | 0.1%
227243498 | 35 | 0.1%
305593498 | 34 | 0.1%
54923498 | 32 | 0.1%
66433498 | 31 | 0.1%
183843498 | 30 | 0.1%
78013498 | 29 | 0.1%
Other values (7936) | 24377 | 98.4%

Value | Count | Frequency (%)
73498 | 1 | < 0.1%
123498 | 1 | < 0.1%
153498 | 1 | < 0.1%
173498 | 1 | < 0.1%
193498 | 1 | < 0.1%

Value | Count | Frequency (%)
339823498 | 1 | < 0.1%
338633498 | 1 | < 0.1%
338123498 | 1 | < 0.1%
338053498 | 1 | < 0.1%
337573498 | 1 | < 0.1%

city_id
Real number (ℝ≥0)

Distinct: 1674
Distinct (%): 6.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 47062.55312
Minimum: 230
Maximum: 100048
Zeros: 0
Zeros (%): 0.0%
Memory size: 193.6 KiB

Quantile statistics

Minimum: 230
5th percentile: 10346
Q1: 24929.5
Median: 46276
Q3: 67290
95th percentile: 90633
Maximum: 100048
Range: 99818
Interquartile range (IQR): 42360.5

Descriptive statistics

Standard deviation: 25751.38346
Coefficient of variation (CV): 0.5471735331
Kurtosis: -0.9935263223
Mean: 47062.55312
Median Absolute Deviation (MAD): 21108
Skewness: 0.05639879334
Sum: 1165598253
Variance: 663133750.2
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
10346 | 2839 | 11.5%
20326 | 1124 | 4.5%
80562 | 1081 | 4.4%
50898 | 588 | 2.4%
60537 | 554 | 2.2%
40441 | 527 | 2.1%
44366 | 475 | 1.9%
90633 | 377 | 1.5%
45358 | 318 | 1.3%
47282 | 306 | 1.2%
Other values (1664) | 16578 | 66.9%

Value | Count | Frequency (%)
230 | 38 | 0.2%
1298 | 185 | 0.7%
1676 | 3 | < 0.1%
1689 | 1 | < 0.1%
1699 | 1 | < 0.1%

Value | Count | Frequency (%)
100048 | 1 | < 0.1%
99983 | 3 | < 0.1%
99965 | 24 | 0.1%
99856 | 1 | < 0.1%
99841 | 1 | < 0.1%

payment_id
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 193.6 KiB

Value | Count
1779 | 10951
1619 | 7319
1491 | 3657
1811 | 2592
1523 | 248

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 99068
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1779
2nd row: 1811
3rd row: 1779
4th row: 1779
5th row: 1779
Value | Count | Frequency (%)
1779 | 10951 | 44.2%
1619 | 7319 | 29.6%
1491 | 3657 | 14.8%
1811 | 2592 | 10.5%
1523 | 248 | 1.0%
Histogram of lengths of the category

Most occurring characters

ValueCountFrequency (%)
140927
41.3%
921927
22.1%
721902
22.1%
67319
 
7.4%
43657
 
3.7%
82592
 
2.6%
5248
 
0.3%
2248
 
0.3%
3248
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number99068
100.0%

Most frequent character per category

ValueCountFrequency (%)
140927
41.3%
921927
22.1%
721902
22.1%
67319
 
7.4%
43657
 
3.7%
82592
 
2.6%
5248
 
0.3%
2248
 
0.3%
3248
 
0.3%

Most occurring scripts

ValueCountFrequency (%)
Common99068
100.0%

Most frequent character per script

ValueCountFrequency (%)
140927
41.3%
921927
22.1%
721902
22.1%
67319
 
7.4%
43657
 
3.7%
82592
 
2.6%
5248
 
0.3%
2248
 
0.3%
3248
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII99068
100.0%

Most frequent character per block

ValueCountFrequency (%)
140927
41.3%
921927
22.1%
721902
22.1%
67319
 
7.4%
43657
 
3.7%
82592
 
2.6%
5248
 
0.3%
2248
 
0.3%
3248
 
0.3%

platform_id
Real number (ℝ≥0)

SKEWED

Distinct: 13
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29889.68922
Minimum: 525
Maximum: 30423
Zeros: 0
Zeros (%): 0.0%
Memory size: 193.6 KiB

Quantile statistics

Minimum: 525
5th percentile: 29463
Q1: 29463
Median: 29815
Q3: 30231
95th percentile: 30359
Maximum: 30423
Range: 29898
Interquartile range (IQR): 768

Descriptive statistics

Standard deviation: 955.3340507
Coefficient of variation (CV): 0.03196199343
Kurtosis: 791.0017466
Mean: 29889.68922
Median Absolute Deviation (MAD): 352
Skewness: -25.89811096
Sum: 740277933
Variance: 912663.1485
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
Value | Count | Frequency (%)
29463 | 8037 | 32.5%
30231 | 7637 | 30.8%
29815 | 4338 | 17.5%
30359 | 3274 | 13.2%
30391 | 676 | 2.7%
29751 | 410 | 1.7%
29495 | 175 | 0.7%
30423 | 103 | 0.4%
30199 | 70 | 0.3%
525 | 22 | 0.1%
Other values (3) | 25 | 0.1%

Value | Count | Frequency (%)
525 | 22 | 0.1%
22167 | 1 | < 0.1%
22263 | 4 | < 0.1%
29463 | 8037 | 32.5%
29495 | 175 | 0.7%

Value | Count | Frequency (%)
30423 | 103 | 0.4%
30391 | 676 | 2.7%
30359 | 3274 | 13.2%
30231 | 7637 | 30.8%
30199 | 70 | 0.3%

transmission_id
Real number (ℝ≥0)

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2220.000323
Minimum: 212
Maximum: 21124
Zeros: 0
Zeros (%): 0.0%
Memory size: 193.6 KiB

Quantile statistics

Minimum: 212
5th percentile: 212
Q1: 212
Median: 212
Q3: 4324
95th percentile: 4356
Maximum: 21124
Range: 20912
Interquartile range (IQR): 4112

Descriptive statistics

Standard deviation: 2058.199582
Coefficient of variation (CV): 0.9271167939
Kurtosis: -1.712491168
Mean: 2220.000323
Median Absolute Deviation (MAD): 0
Skewness: 0.07884075779
Sum: 54982748
Variance: 4236185.519
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Value | Count | Frequency (%)
212 | 12666 | 51.1%
4356 | 5617 | 22.7%
4228 | 3163 | 12.8%
4324 | 2980 | 12.0%
4260 | 163 | 0.7%
4996 | 144 | 0.6%
4196 | 32 | 0.1%
2020 | 1 | < 0.1%
21124 | 1 | < 0.1%

Value | Count | Frequency (%)
212 | 12666 | 51.1%
2020 | 1 | < 0.1%
4196 | 32 | 0.1%
4228 | 3163 | 12.8%
4260 | 163 | 0.7%

Value | Count | Frequency (%)
21124 | 1 | < 0.1%
4996 | 144 | 0.6%
4356 | 5617 | 22.7%
4324 | 2980 | 12.0%
4260 | 163 | 0.7%

Interactions

[Pairwise interaction plots; images not preserved.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
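
The calculation described above can be sketched in a few lines of standard-library Python. This is an illustrative implementation, not the report's own code; `pearson_r` is a hypothetical name. The 1/n factors of the covariance and the standard deviations cancel, so they are omitted.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products of deviations; the common 1/n factor cancels out.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 9.0]
r = pearson_r(x, y)
# Shifting and rescaling y by a positive factor leaves r unchanged.
r_rescaled = pearson_r(x, [3 * v + 7 for v in y])
```

The function is undefined for a constant input (zero standard deviation), which is one reason a constant column such as is_failed needs special handling.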

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
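
As a sketch of that recipe: replace each value by its rank, then apply the Pearson formula to the ranks. The helper names below are hypothetical, and giving tied values the average of their ranks is an assumption (it is the usual convention).

```python
import math


def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A nonlinear but strictly increasing relationship.
rho = spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25])
```

Because y = x² is strictly increasing on this input, the ranks agree exactly and ρ is 1 even though the relationship is not linear.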

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
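
The pair-counting definition above corresponds to what is often called tau-a. A minimal sketch follows; `kendall_tau_a` is a hypothetical name, and library implementations commonly use a tie-adjusted variant (tau-b), so results may differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts towards neither.
    n = len(x)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
```

In this example five of the six pairs are concordant and one is discordant, giving τ = 4/6.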

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
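
A minimal sketch of a bias-corrected Cramér's V in the style of Bergsma's correction, written from the commonly cited formulas rather than from the report's source code. The function name is hypothetical, and returning 0.0 when a variable has a single category is an assumption made to avoid division by zero.

```python
import math
from collections import Counter


def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(x)
    joint = Counter(zip(x, y))
    row_totals = Counter(x)
    col_totals = Counter(y)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    r, k = len(row_totals), len(col_totals)
    # Bias correction: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0  # assumed convention for a constant variable
    return math.sqrt(phi2_corr / denom)


perfect = cramers_v_corrected(["a"] * 50 + ["b"] * 50, ["u"] * 50 + ["v"] * 50)
independent = cramers_v_corrected(["a", "a", "b", "b"], ["u", "v", "u", "v"])
```

A constant column, such as is_failed in this dataset, has only one category, so the denominator vanishes and the sketch returns 0.0 rather than raising an error.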

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
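
The per-column nullity count that these plots visualize can be computed directly. The sketch below assumes rows are plain dictionaries and treats both `None` and floating-point NaN as missing; `missing_per_column` is a hypothetical helper, and the two rows are abbreviated from the first rows of this dataset.

```python
import math


def missing_per_column(rows):
    """Count missing values (None or NaN) in each column of a list of dict rows."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            is_missing = value is None or (
                isinstance(value, float) and math.isnan(value)
            )
            counts[column] = counts.get(column, 0) + int(is_missing)
    return counts


rows = [
    {"df_index": 160, "customer_order_rank": float("nan"), "is_failed": 1},
    {"df_index": 226, "customer_order_rank": float("nan"), "is_failed": 1},
]
missing = missing_per_column(rows)
```

In the full dataset only customer_order_rank has missing values, and it is missing in every row.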

Sample

First rows

# | df_index | customer_id | order_date | order_hour | customer_order_rank | is_failed | voucher_amount | delivery_fee | amount_paid | restaurant_id | city_id | payment_id | platform_id | transmission_id
0 | 160 | 000afe75dc19 | 2015-06-03 | 21 | NaN | 1 | 0.000 | 0.00000 | 7.43400 | 161043498 | 97301 | 1779 | 29751 | 4356
1 | 226 | 0011683ab734 | 2017-02-05 | 18 | NaN | 1 | 1.029 | 1.23250 | 8.76150 | 274713498 | 64562 | 1811 | 30231 | 212
2 | 230 | 0011f9b2693b | 2016-09-01 | 14 | NaN | 1 | 0.000 | 0.98107 | 14.00247 | 159553498 | 39335 | 1779 | 29463 | 212
3 | 233 | 0011f9b2693b | 2016-09-05 | 11 | NaN | 1 | 0.000 | 0.98600 | 8.76150 | 187213498 | 39335 | 1779 | 29463 | 212
4 | 246 | 0012b1474694 | 2017-02-12 | 18 | NaN | 1 | 0.000 | 0.00000 | 10.03590 | 317543498 | 24334 | 1779 | 29463 | 212
5 | 259 | 00144f5638a1 | 2016-05-20 | 0 | NaN | 1 | 0.000 | 0.00000 | 5.84100 | 47643498 | 28426 | 1523 | 29815 | 4324
6 | 312 | 001a947e5f30 | 2015-11-29 | 14 | NaN | 1 | 0.000 | 1.42970 | 12.31920 | 145893498 | 80562 | 1779 | 29815 | 212
7 | 334 | 001ae9f68533 | 2017-01-31 | 20 | NaN | 1 | 0.000 | 0.73950 | 7.22160 | 307033498 | 45358 | 1779 | 29463 | 212
8 | 558 | 00337d8863ff | 2016-06-16 | 21 | NaN | 1 | 0.000 | 0.73950 | 12.47850 | 218843498 | 38407 | 1811 | 30231 | 212
9 | 568 | 003405d5944f | 2017-02-26 | 17 | NaN | 1 | 0.000 | 0.49300 | 22.14270 | 184463498 | 63717 | 1619 | 29463 | 4356

Last rows

# | df_index | customer_id | order_date | order_hour | customer_order_rank | is_failed | voucher_amount | delivery_fee | amount_paid | restaurant_id | city_id | payment_id | platform_id | transmission_id
24757 | 786349 | ffe2a6942fd2 | 2015-08-17 | 19 | NaN | 1 | 0.0 | 0.0 | 4.5135 | 76643498 | 79770 | 1811 | 30231 | 212
24758 | 786350 | ffe2a6942fd2 | 2015-08-17 | 19 | NaN | 1 | 0.0 | 0.0 | 6.0534 | 76643498 | 79770 | 1811 | 30231 | 212
24759 | 786358 | ffe2a6942fd2 | 2015-10-08 | 20 | NaN | 1 | 0.0 | 0.0 | 11.1510 | 188273498 | 79920 | 1811 | 30231 | 212
24760 | 786359 | ffe2a6942fd2 | 2015-10-11 | 20 | NaN | 1 | 0.0 | 0.0 | 7.9650 | 84663498 | 79920 | 1811 | 30231 | 212
24761 | 786364 | ffe2a6942fd2 | 2015-11-23 | 21 | NaN | 1 | 0.0 | 0.0 | 7.9650 | 84663498 | 79920 | 1779 | 29463 | 212
24762 | 786365 | ffe2a6942fd2 | 2015-11-23 | 21 | NaN | 1 | 0.0 | 0.0 | 7.9650 | 84663498 | 79920 | 1779 | 30231 | 212
24763 | 786372 | ffe2a6942fd2 | 2016-01-28 | 21 | NaN | 1 | 0.0 | 0.0 | 9.2925 | 84663498 | 79920 | 1779 | 30231 | 212
24764 | 786423 | ffe7577807b8 | 2016-10-03 | 21 | NaN | 1 | 0.0 | 0.0 | 7.6995 | 131513498 | 73661 | 1779 | 30231 | 212
24765 | 786593 | fffe9d5a8d41 | 2016-07-31 | 21 | NaN | 1 | 0.0 | 0.0 | 8.4429 | 156133498 | 10346 | 1811 | 29463 | 212
24766 | 786595 | fffe9d5a8d41 | 2016-09-30 | 20 | NaN | 1 | 0.0 | 0.0 | 10.7262 | 983498 | 10346 | 1779 | 29463 | 212